By Using Existing Processor Resources More Efficiently, Hyperthreading Technology Improves Performance at Little Cost and Increases Chip Size by Less Than
نویسندگان
چکیده
Hyperthreading technology, which brings the concept of simultaneous multithreading to the Intel architecture, was first introduced on the Intel Xeon processor in early 2002 for the server market. In November 2002, Intel launched the technology on the Intel Pentium 4 at clock frequencies of 3.06 GHz and higher, making the technology widely available to the consumer market. This technology signals a new direction in microarchitecture development and fundamentally changes the cost-benefit tradeoffs of microarchitecture design choices. This article describes how the technology works, that is, how we make a single physical processor appear as multiple logical processors to operating systems and software. We highlight the additional structures and die area needed to implement the technology and discuss the fundamental ideas behind the technology and why we can get a 25-percent boost in performance from a technology that costs less than 5 percent in added die area. We illustrate the importance of choosing the right sharing policy for each shared resource by describing, examining, and comparing three different sharing policies: partitioned resources, threshold sharing, and full sharing. The choice of policy depends on the traffic pattern, complexity and size of the resource, potential deadlock/livelock scenarios, and other considerations. Finally, we show how this technology significantly improves performance on several relevant workloads. The die photos and descriptions in this article illustrate the technology’s first implementations on Intel’s Xeon and Pentium 4 processor families. These first implementations emphasized cost containment. Future implementations should provide even greater performance benefits.
منابع مشابه
CAFT: Cost-aware and Fault-tolerant routing algorithm in 2D mesh Network-on-Chip
By increasing, the complexity of chips and the need to integrating more components into a chip has made network –on- chip known as an important infrastructure for network communications on the system, and is a good alternative to traditional ways and using the bus. By increasing the density of chips, the possibility of failure in the chip network increases and providing correction and fault tol...
متن کاملGuest Editors' Introduction: Interaction of Many-Core Computer Architecture and Operating Systems
......Practices and conventions in designing a microprocessor are undergoing dramatic changes. Squeezing more performance from a single, complex, and powerhungry core has become a difficult proposition, whereas incorporating multiple processor cores on a silicon die is relatively simple due to the continuing advances in process technology. Major microprocessor vendors are currently marketing mu...
متن کاملUnauthenticated event detection in wireless sensor networks using sensors co-coverage
Wireless Sensor Networks (WSNs) offer inherent packet redundancy since each point within the network area is covered by more than one sensor node. This phenomenon, which is known as sensors co-coverage, is used in this paper to detect unauthenticated events. Unauthenticated event broadcasting in a WSN imposes network congestion, worsens the packet loss rate, and increases the network energy con...
متن کاملA High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traıc and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), which existing solutions such as BSD Radix Tries, scale poorly when traıc in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...
متن کاملA High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traıc and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), which existing solutions such as BSD Radix Tries, scale poorly when traıc in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...
متن کامل